Data fusion using Bayesian theory and reinforcement learning method
Authors
Abstract
Similar resources
Using Trajectory Data to Improve Bayesian Optimization for Reinforcement Learning
Recently, Bayesian Optimization (BO) has been used to successfully optimize parametric policies in several challenging Reinforcement Learning (RL) applications. BO is attractive for this problem because it maintains Bayesian prior information about the expected return and exploits this knowledge to select new policies to execute. Effectively, the BO framework for policy search addresses the expl...
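As a rough illustration of the BO-for-policy-search loop this abstract describes, here is a minimal numpy sketch: a Gaussian-process surrogate over a single policy parameter, with an upper-confidence-bound acquisition rule standing in for whatever acquisition the paper actually uses. The toy return function, kernel length-scale, and parameter grid are all hypothetical.

```python
import numpy as np

# Toy "episodic return" of a one-parameter policy; a stand-in for running
# a real RL rollout. The function, kernel width, and grid are assumptions.
def episodic_return(theta):
    return -(theta - 0.6) ** 2 + 0.05 * np.sin(8 * theta)

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel between two 1-D arrays of parameters.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    # Standard GP regression posterior mean and standard deviation.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    K_inv = np.linalg.inv(K)
    mu = Ks.T @ K_inv @ y
    var = np.diag(rbf(Xs, Xs) - Ks.T @ K_inv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

# BO loop: fit the surrogate to returns observed so far, pick the policy
# parameter maximizing an upper confidence bound, execute it, repeat.
X = np.array([0.1, 0.9])            # policy parameters tried so far
y = episodic_return(X)
grid = np.linspace(0.0, 1.0, 200)
for _ in range(10):
    mu, sd = gp_posterior(X, y, grid)
    theta_next = grid[np.argmax(mu + 2.0 * sd)]   # UCB acquisition
    X = np.append(X, theta_next)
    y = np.append(y, episodic_return(theta_next))

print("best policy parameter found:", X[np.argmax(y)])
```

The point of the design is that each "function evaluation" is a full policy execution, so the surrogate's job is to keep the number of executed policies small.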
Improving Bayesian Reinforcement Learning Using Transition Abstraction
Bayesian Reinforcement Learning (BRL) provides an optimal solution to on-line learning while acting, but it is computationally intractable for all but the simplest problems: at each decision time, an agent should weigh all possible courses of action by beliefs about future outcomes constructed over long time horizons. To improve tractability, previous research has focused on sparsely sampling p...
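The intractability point can be made concrete. A fully Bayes-optimal agent would plan over a tree of beliefs; the sketch below instead uses posterior (Thompson) sampling with Dirichlet beliefs over transitions as a cheap, tractable stand-in, which is not the sparse-sampling approach the paper itself develops. The MDP, reward table, and one-step lookahead rule are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 2-state, 2-action MDP with unknown transitions and known rewards.
# Dirichlet counts encode the agent's belief over each row of P(s'|s,a).
n_states, n_actions = 2, 2
counts = np.ones((n_states, n_actions, n_states))   # uniform prior
R = np.array([[0.0, 1.0],                           # assumed reward R[s, a]
              [1.0, 0.0]])
true_P = np.array([[[0.9, 0.1], [0.2, 0.8]],        # hidden ground truth
                   [[0.3, 0.7], [0.6, 0.4]]])

def thompson_action(s):
    # Sample one full transition model from the posterior and act greedily
    # on a shallow one-step lookahead; this replaces the intractable
    # belief-tree search with a posterior-sampling heuristic.
    P = np.array([[rng.dirichlet(counts[si, ai]) for ai in range(n_actions)]
                  for si in range(n_states)])
    q = R[s] + P[s] @ R.max(axis=1)
    return int(np.argmax(q))

s = 0
for _ in range(500):
    a = thompson_action(s)
    s_next = rng.choice(n_states, p=true_P[s, a])
    counts[s, a, s_next] += 1      # conjugate Dirichlet posterior update
    s = s_next

print("posterior mean of P(.|s=0, a=0):", counts[0, 0] / counts[0, 0].sum())
```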
Bayesian Inverse Reinforcement Learning
Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an expert. IRL is motivated by situations where knowledge of the rewards is a goal by itself (as in preference elicitation) and by the task of apprenticeship learning (learning policies from an expert). In this paper we sh...
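A minimal sketch of the Bayesian IRL setup this abstract describes: given known dynamics and expert state-action pairs, place a prior over the reward vector and sample from its posterior with random-walk Metropolis, scoring each candidate reward by a Boltzmann expert likelihood. The two-state MDP, the demonstrations, and the rationality parameter alpha are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-state MDP with *known* dynamics P[s, a, s']; we infer a posterior
# over the per-state reward vector r from expert demonstrations.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.7, 0.3], [0.4, 0.6]]])
gamma, alpha = 0.9, 5.0            # discount, expert rationality (assumed)
demos = [(0, 1), (1, 0), (0, 1)]   # hypothetical expert (state, action) pairs

def q_values(r, iters=100):
    # Value iteration under candidate reward r; returns Q[s, a].
    V = np.zeros(2)
    Q = np.zeros((2, 2))
    for _ in range(iters):
        Q = r[:, None] + gamma * (P @ V)
        V = Q.max(axis=1)
    return Q

def log_lik(r):
    # Boltzmann expert model: p(a|s) proportional to exp(alpha * Q[s, a]).
    Q = alpha * q_values(r)
    logp = Q - np.log(np.exp(Q).sum(axis=1, keepdims=True))
    return sum(logp[s, a] for s, a in demos)

# Random-walk Metropolis over r with a flat prior on [-1, 1]^2.
r = np.zeros(2)
ll = log_lik(r)
samples = []
for _ in range(2000):
    prop = np.clip(r + 0.1 * rng.normal(size=2), -1.0, 1.0)
    ll_prop = log_lik(prop)
    if np.log(rng.random()) < ll_prop - ll:
        r, ll = prop, ll_prop
    samples.append(r.copy())

print("posterior mean reward:", np.mean(samples[500:], axis=0))
```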
Linear Bayesian Reinforcement Learning
This paper proposes a simple linear Bayesian approach to reinforcement learning. We show that with an appropriate basis, a Bayesian linear Gaussian model is sufficient for accurately estimating the system dynamics, and in particular when we allow for correlated noise. Policies are estimated by first sampling a transition model from the current posterior, and then performing approximate dynamic ...
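The closed-form machinery behind this abstract is ordinary Bayesian linear regression. A sketch, assuming a hand-picked four-feature basis and a known noise variance (both hypothetical): fit the Gaussian posterior over the dynamics weights, then draw one transition model from it, which corresponds to the "sampling a transition model from the current posterior" step; the subsequent approximate dynamic programming is omitted here.

```python
import numpy as np

rng = np.random.default_rng(2)

def phi(s, a):
    # Hand-picked basis for the next-state model (hypothetical choice).
    return np.array([1.0, s, a, s * a])

noise_var, prior_var = 0.05, 10.0          # assumed known noise, broad prior
true_w = np.array([0.1, 0.8, 0.3, -0.2])   # hidden ground-truth dynamics

# Transitions gathered by a random behavior policy.
S = rng.uniform(-1.0, 1.0, 200)
A = rng.integers(0, 2, 200).astype(float)
Phi = np.stack([phi(s, a) for s, a in zip(S, A)])
S_next = Phi @ true_w + np.sqrt(noise_var) * rng.normal(size=200)

# Conjugate Gaussian posterior over the weight vector w.
precision = Phi.T @ Phi / noise_var + np.eye(4) / prior_var
cov = np.linalg.inv(precision)
mean = cov @ Phi.T @ S_next / noise_var

# One *sampled* transition model, ready to be handed to a planner.
w_sample = rng.multivariate_normal(mean, cov)
print("posterior mean w:", mean.round(3))
print("sampled model w :", w_sample.round(3))
```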
Bayesian Hierarchical Reinforcement Learning
We describe an approach to incorporating Bayesian priors in the MAXQ framework for hierarchical reinforcement learning (HRL). We define priors on the primitive environment model and on task pseudo-rewards. Since models for composite tasks can be complex, we use a mixed model-based/model-free learning approach to find an optimal hierarchical policy. We show empirically that (i) our approach resu...
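As a small, self-contained illustration of one ingredient mentioned here, priors on task pseudo-rewards, the sketch below runs a conjugate Normal update for the mean pseudo-reward of a single hypothetical subtask; the MAXQ decomposition and the mixed model-based/model-free learning are beyond a short sketch, and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Conjugate prior on the mean pseudo-reward of one hypothetical subtask:
# observations are returns recorded when the subtask terminates, modeled
# as Normal with known variance. All constants below are assumptions.
mu0, var0 = 0.0, 4.0      # prior mean and variance on the pseudo-reward
obs_var = 1.0             # assumed known observation noise

observed = rng.normal(2.0, 1.0, size=12)   # simulated subtask-exit returns

# Closed-form Normal-Normal update for the unknown mean.
post_var = 1.0 / (1.0 / var0 + len(observed) / obs_var)
post_mean = post_var * (mu0 / var0 + observed.sum() / obs_var)

print(f"pseudo-reward posterior: N({post_mean:.2f}, {post_var:.3f})")
```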
Journal
Journal title: Science China Information Sciences
Year: 2020
ISSN: 1674-733X, 1869-1919
DOI: 10.1007/s11432-019-2751-4